Web Log Mining Based-on Improved Double-Points Crossover Genetic Algorithm

Author

  • Jin Xie
Abstract

Web log files have become an important data source for discovering user behavior, and analyzing them is one of the significant research fields of web mining. This paper proposes an improved double-point crossover genetic algorithm for mining user access patterns from web log files. Our work has three components. First, we design a coding rule based on pre-processed web log data. Second, we present a fitness function derived from an analysis of user sessions. Finally, we design a new genetic algorithm based on the double-point crossover genetic algorithm. Compared with a simple genetic algorithm, the double-point crossover genetic algorithm converges better and is more suitable for web log mining. We conducted an experiment to verify the effectiveness of the proposed algorithm. The results show that the proposed algorithm helps a website obtain user access patterns easily.
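The abstract does not give the coding rule, the fitness function, or the details of the improvement, so the following is only a minimal sketch of plain double-point (two-point) crossover together with one possible session-based fitness. The encoding (one page ID per gene), the subsequence-support fitness, and the names two_point_crossover, is_subsequence, and support are all assumptions made for illustration, not the paper's method.

```python
import random
from typing import List, Sequence, Tuple

Chromosome = List[int]  # assumed encoding: one page ID per gene


def two_point_crossover(
    a: Sequence[int], b: Sequence[int], rng: random.Random
) -> Tuple[Chromosome, Chromosome]:
    """Swap the segment between two random cut points of equal-length parents."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    n = len(a)
    if n < 3:
        # Too short for two distinct interior cut points: return copies.
        return list(a), list(b)
    # Two distinct cut points with 1 <= i < j <= n - 1.
    i, j = sorted(rng.sample(range(1, n), 2))
    child1 = list(a[:i]) + list(b[i:j]) + list(a[j:])
    child2 = list(b[:i]) + list(a[i:j]) + list(b[j:])
    return child1, child2


def is_subsequence(pattern: Sequence[int], session: Sequence[int]) -> bool:
    """True if the pages of `pattern` occur in `session` in the same order."""
    it = iter(session)
    return all(page in it for page in pattern)


def support(pattern: Sequence[int], sessions: Sequence[Sequence[int]]) -> float:
    """Hypothetical fitness: fraction of sessions that contain the pattern."""
    if not sessions:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in sessions) / len(sessions)


if __name__ == "__main__":
    rng = random.Random(42)
    sessions = [[1, 2, 3, 4], [1, 3, 4], [2, 4, 1], [1, 2, 4]]
    parent_a, parent_b = [1, 2, 3, 4], [2, 4, 1, 3]
    for child in two_point_crossover(parent_a, parent_b, rng):
        print(child, support(child, sessions))
```

Because only the middle segment is exchanged, each child keeps the first and last genes of one parent. A single-point crossover would instead exchange everything after one cut point.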

Related resources

A Technique for Improving Web Mining using Enhanced Genetic Algorithm

World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines used conventional methods to retrieve information on the Web; however, the search results of these engines are still able to be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms which search according to the user interests...

QoS-Based web service composition based on genetic algorithm

Quality of service (QoS) is an important issue in the design and management of web service composition. QoS in web services consists of various non-functional factors, such as execution cost, execution time, availability, successful execution rate, and security. In recent years, the number of available web services has proliferated, and then offered the same services increasingly. The same web ...

Cosine Similarity Measure and Genetic Algorithm for extracting main content from web documents

Because of the use of growing information, web mining has become a primary necessity of world. Due to this, research on web mining has received a lot of interest from both industry and academia. Mining and prediction of user’s web browsing behaviors and deducing the actual content in a web document is one of the active subjects. The information on web is dirty. Apart from useful information, it...

Constraint Informative Rules for Genetic Algorithm-based Web Page Recommendation System

To predict the users navigation using web usage mining is the primary motto of the web page recommendation. Currently, researchers are trying to develop a web page recommendation using pattern mining technique. Here, we propose a technique for web page recommendation using genetic algorithm. It consists of three phases as data preparation, mining of informative rules and recommendation. The dat...

Automatic discovery of the sequential accesses from web log data files via a genetic algorithm

This paper is concerned with finding sequential accesses from web log files, using ‘Genetic Algorithm’ (GA). Web log files are independent from servers, and they are ASCII format. Each transaction, whether completed or not, is recorded in the web log files and these files are unstructured for knowledge discovery in database techniques. Data which is stored in web logs have become important for ...

Journal title:
  • Journal of Multimedia

Volume 9  Issue 

Pages  -

Publication date 2014